================================================================================
REPLICATION ARCHIVE
================================================================================

Title:    Text as Behavior
Author:   Omar Wasow
Contact:  owasow@berkeley.edu
Journal:  Political Analysis
Date:     2026

================================================================================
OVERVIEW
================================================================================

This archive contains the data and code needed to replicate all figures and
tables in "Text as Behavior" (Wasow 2026). The analysis examines open-ended
survey responses from the ANES (2016, 2020, 2024), Afrobarometer Round 6, and
the Kuo, Malhotra & Mo (2017) social exclusion experiment.

================================================================================
DATA AVAILABILITY STATEMENT
================================================================================

This paper uses data from three sources:

1. AMERICAN NATIONAL ELECTION STUDIES (ANES)
   - 2016 Time Series Study
   - 2020 Time Series Study
   - 2024 Time Series Study (Preliminary Release)
   Access: Free registration required at https://electionstudies.org
   License: ANES Terms of Use (academic research)

2. AFROBAROMETER Round 6 (2016)
   - Merged data from 36 countries
   Access: Free download at https://www.afrobarometer.org
   License: Creative Commons Attribution

3. KMM SOCIAL EXCLUSION EXPERIMENT
   - Kuo, Malhotra & Mo (2017) replication data
   Access: Harvard Dataverse (doi:10.7910/DVN/GMWOY6)
   License: CC0 Public Domain

Pre-processed .Rdata files are included in text_data_output/ for convenience.
To replicate from raw data, download the original files as described under
RAW DATA FILE LOCATIONS.

================================================================================
COMPUTATIONAL REQUIREMENTS
================================================================================

SOFTWARE
--------
- R version 4.3.0 or higher
- RStudio (recommended, not required)
- LaTeX distribution (for PDF rendering): TeX Live 2023 or MacTeX 2023
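
A quick preflight check for the software listed above might look like the
sketch below. The helper names are hypothetical, and the list of LaTeX
engines is an assumption, since this README does not say which engine the
Rmd uses.

```r
# Hypothetical helper: TRUE when the running R meets the minimum version.
meets_r_requirement <- function(current = getRversion(), minimum = "4.3.0") {
  current >= numeric_version(minimum)
}

# Hypothetical helper: TRUE when at least one LaTeX engine is on the PATH.
# The engine names are assumed, not taken from the replication code.
has_latex <- function(engines = c("pdflatex", "xelatex", "lualatex")) {
  any(nzchar(Sys.which(engines)))
}

message("R >= 4.3.0:  ", meets_r_requirement())
message("LaTeX found: ", has_latex())
```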

R PACKAGES
----------
All required packages are installed automatically by install_packages.R.
See loaded_packages_and_versions.tsv for exact versions used.

Key packages: tidyverse 2.0.0, stargazer 5.2.3, kableExtra 1.4.0,
              ggplot2 3.5.2, haven 2.5.5, marginaleffects 0.28.0
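
To see how a local library compares with the key package versions listed
above, something like the following sketch could be used. The function name
is hypothetical, and it checks only the six packages named in this README,
not the full contents of loaded_packages_and_versions.tsv.

```r
# Key package versions as listed in this README.
expected_versions <- c(
  tidyverse = "2.0.0", stargazer = "5.2.3", kableExtra = "1.4.0",
  ggplot2 = "3.5.2", haven = "2.5.5", marginaleffects = "0.28.0"
)

# Hypothetical helper: report whether each package is missing, older than,
# equal to, or newer than the expected version.
compare_versions <- function(expected,
                             installed = installed.packages()[, "Version"]) {
  found <- unname(installed[names(expected)])
  status <- vapply(seq_along(expected), function(i) {
    if (is.na(found[i])) return("missing")
    switch(as.character(utils::compareVersion(found[i], expected[[i]])),
           "-1" = "older", "0" = "match", "1" = "newer")
  }, character(1))
  data.frame(package = names(expected), expected = unname(expected),
             installed = found, status = status, stringsAsFactors = FALSE)
}

print(compare_versions(expected_versions))
```

A "newer" status is not necessarily a problem, but it is the first thing to
check if results differ from the published output.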

HARDWARE
--------
Tested on: macOS Sonoma 14.x, Apple M1/M2, 16GB RAM
Expected runtime: ~10-15 minutes (full build from raw data)
                  ~3-5 minutes (PDF only, using pre-processed data)
Disk space: ~500MB for archive, ~2GB with raw data files

================================================================================
ARCHIVE CONTENTS
================================================================================

text_replication.Rproj     RStudio project file (open this first)
README.txt                 This file
loaded_packages_and_versions.tsv   R package versions used

text_code/
    make_script_rep.R      ** MASTER BUILD SCRIPT ** (run this)
    install_packages.R     Installs required R packages
    text_packages_rep.R    Loads packages and helper functions
    text_functions_rep.R   Project-specific functions
    custom_ggplot_themes.R Custom plot themes
    custom_table_functions_rep.R  Table formatting functions
    anes2016_processing_rep.R     ANES 2016 data processing
    anes2020_processing_rep.R     ANES 2020 data processing
    anes2024_processing_rep.R     ANES 2024 data processing
    afro_processing.R      Afrobarometer data processing
    aap_processing.R       KMM experiment data processing
    create_codebooks.R     Generates variable codebooks

text_docs/
    text_as_behavior_rep.Rmd      Main replication document (RMarkdown)
    text_bib_2025.bib             Bibliography
    cup-pan.cls                   Cambridge journal class file

text_data_output/          Pre-processed data files (.Rdata)
    anes2016_processed.Rdata
    anes2016_tidy_text.Rdata
    anes2020_merged.Rdata
    anes2024_merged.Rdata
    afrobarometer_processed.Rdata
    aap_processed.Rdata

text_data_raw/             (Empty - for raw data if replicating from source)
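
This README does not document the object names stored inside the
pre-processed .Rdata files. A minimal sketch for inspecting one without
touching the global environment (the helper name is hypothetical):

```r
# Hypothetical helper: load an .Rdata file into a scratch environment and
# return the names and primary classes of the objects it contains.
list_rdata_objects <- function(path) {
  env <- new.env(parent = emptyenv())
  load(path, envir = env)
  objs <- sort(ls(env))
  data.frame(
    object = objs,
    class = vapply(objs, function(nm) class(get(nm, envir = env))[1],
                   character(1), USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

# Run from the project root; skipped if the file is not present.
path <- file.path("text_data_output", "anes2016_processed.Rdata")
if (file.exists(path)) print(list_rdata_objects(path))
```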

================================================================================
INSTRUCTIONS FOR REPLICATION
================================================================================

OPTION A: QUICK REPLICATION (Using Pre-Processed Data)
------------------------------------------------------

Uses existing .Rdata files in text_data_output/ to render the PDF.
This is faster and does not require downloading raw data.

From the command line, in the project root:

   Rscript text_code/make_script_rep.R

Or in RStudio:

   1. Open text_replication.Rproj
   2. Open text_docs/text_as_behavior_rep.Rmd and click "Knit"


OPTION B: FULL REPLICATION (From Raw Data)
------------------------------------------

Processes raw survey data before rendering. Requires downloading the
ANES, Afrobarometer, and KMM datasets first (see RAW DATA FILE LOCATIONS).

1. Download raw data files to text_data_raw/ as specified under
   RAW DATA FILE LOCATIONS

2. From the command line, in the project root:

   Rscript text_code/make_script_rep.R --from-raw

   Or in RStudio, first edit text_code/make_script_rep.R to set
   from_raw <- TRUE, then run:

   source(here::here("text_code", "make_script_rep.R"))

   (Alternatively, run the processing scripts individually before
   rendering.)
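
This README does not show how make_script_rep.R reads the --from-raw flag.
One common pattern, given here only as an assumed sketch with a hypothetical
helper name, is to look for the flag in commandArgs():

```r
# Hypothetical helper: TRUE when "--from-raw" appears among the arguments
# passed after the script name on the Rscript command line.
wants_from_raw <- function(args = commandArgs(trailingOnly = TRUE)) {
  "--from-raw" %in% args
}

from_raw <- wants_from_raw()
message("from_raw = ", from_raw)
```

In an interactive session there are no trailing arguments, which is
consistent with setting from_raw by hand in RStudio.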

================================================================================
RAW DATA FILE LOCATIONS
================================================================================

If replicating from raw data, download these files:

ANES 2016
---------
Source: https://electionstudies.org/data-center/2016-time-series-study/
Files (place in text_data_raw/ANES/):
  - anes_timeseries_2016_rawdata.txt
  - anes_timeseries_2016_redacted_openends.xlsx
  - anes_timeseries_2016_voteval.csv

ANES 2020
---------
Source: https://electionstudies.org/data-center/2020-time-series-study/
Files (place in text_data_raw/ANES/):
  - anes_timeseries_2020_stata_20220210.dta
  - anes_timeseries_2020_redactedopenends_excel_20211118.xlsx
  - anes_timeseries_2020_csv_VoterValidation.csv

ANES 2024
---------
Source: https://electionstudies.org/data-center/2024-time-series-study/
Files (place in text_data_raw/ANES/):
  - anes_timeseries_2024_spss_20250808.sav
  - anes_timeseries_2024_redactedopenends_excel_20250923.xlsx

AFROBAROMETER
-------------
Source: https://www.afrobarometer.org/survey-resource/merged-round-6-data-36-countries-2016/
File (place in text_data_raw/Afrobarometer/):
  - merged_r6_data_2016_36countries2.sav

KMM/SOCIAL EXCLUSION
--------------------
Source: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/GMWOY6
File (place in text_data_raw/):
  - experimentaldata_kmm_050916_final.dta

NOTE: All code requires data in "Original Format", not "Archival Format
(.tab)". When a repository offers both, download the original-format file.
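
Before running the full build, the file placement described above can be
checked with a short sketch like this one. The file names and folders come
from this section; the helper names are hypothetical.

```r
# Raw files listed in this README, relative to text_data_raw/.
expected_raw_files <- c(
  "ANES/anes_timeseries_2016_rawdata.txt",
  "ANES/anes_timeseries_2016_redacted_openends.xlsx",
  "ANES/anes_timeseries_2016_voteval.csv",
  "ANES/anes_timeseries_2020_stata_20220210.dta",
  "ANES/anes_timeseries_2020_redactedopenends_excel_20211118.xlsx",
  "ANES/anes_timeseries_2020_csv_VoterValidation.csv",
  "ANES/anes_timeseries_2024_spss_20250808.sav",
  "ANES/anes_timeseries_2024_redactedopenends_excel_20250923.xlsx",
  "Afrobarometer/merged_r6_data_2016_36countries2.sav",
  "experimentaldata_kmm_050916_final.dta"
)

# Hypothetical helper: expected files that are not yet in raw_dir.
missing_raw_files <- function(raw_dir = "text_data_raw",
                              files = expected_raw_files) {
  files[!file.exists(file.path(raw_dir, files))]
}

# Hypothetical helper: any .tab files, which suggest an archival-format
# download rather than the original format.
archival_tab_files <- function(raw_dir = "text_data_raw") {
  list.files(raw_dir, pattern = "\\.tab$", recursive = TRUE,
             ignore.case = TRUE)
}

print(missing_raw_files())
print(archival_tab_files())
```

Both calls assume the working directory is the project root; an empty
result from each means the raw files are in place.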

================================================================================
OUTPUT
================================================================================

The main output is:
  text_docs/text_as_behavior_rep.pdf

Codebooks documenting variables (generated by create_codebooks.R):
  text_data_output/codebook_anes.pdf
  text_data_output/codebook_afrobarometer.pdf
  text_data_output/codebook_kmm.pdf

================================================================================
TROUBLESHOOTING
================================================================================

- If package installation fails, try: install.packages("pak"); pak::pak("tidyverse")
- LaTeX errors: Ensure TeX distribution is installed and updated
- Memory errors: Close other applications; some operations require 8GB+ RAM
- Path errors: Ensure working directory is project root (use here::here())
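
To diagnose path errors without relying on any package, a base-R sketch
like the one below walks up from the working directory until it finds
text_replication.Rproj. The function name is hypothetical; in the
replication code itself, paths are built with here::here().

```r
# Hypothetical helper: return the nearest ancestor directory containing the
# project file, or NA if none is found.
find_project_root <- function(start = getwd(),
                              marker = "text_replication.Rproj") {
  dir <- normalizePath(start, winslash = "/", mustWork = TRUE)
  repeat {
    if (file.exists(file.path(dir, marker))) return(dir)
    parent <- dirname(dir)
    if (identical(parent, dir)) return(NA_character_)  # reached filesystem root
    dir <- parent
  }
}

root <- find_project_root()
if (is.na(root)) {
  message("Project root not found; open text_replication.Rproj first.")
} else {
  message("Project root: ", root)
}
```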

================================================================================
LICENSE
================================================================================

Code: MIT License
Data: See individual data source licenses above

================================================================================
